sink(cloudstorage): add use-table-id-as-path option by yinshuangfei · Pull Request #4356 · pingcap/ticdc

yinshuangfei · 2026-03-05T06:51:54Z

What problem does this PR solve?

Issue Number: close #4357

What is changed and how it works?

Adds config use-table-id-as-path.

The use-table-id-as-path option switches the path to use table_id instead of table_name when it set to true.
With configuration use-table-id-as-path=true in sink uri, for example: --sink-uri="s3://cdc&use-table-id-as-path=true", the cdc path changed from

test_db/table_name/5/2024-01-01/CDC_xxx.json

to

12345/5/2024-01-01/CDC_xxx.json

In this mode, we adjust cloud storage path generation to omit schema when table-id-as-path is enabled and skip DB schema writes.

Adds use_table_id_as_path into v2 API conversion and sink config parsing.
Remove partition_id from the path to prevent duplicates.

Check List

Tests

Unit test

Questions

Will it cause performance regression or break compatibility?

Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Summary by CodeRabbit

New Features
- Added optional use-table-id-as-path configuration parameter for cloud storage sink, enabling file organization by numeric table IDs instead of table names.
Bug Fixes
- Enhanced validation for exchange partition operations to properly detect and handle missing source table information.
Tests
- Added test coverage for table ID-based path generation and exchange partition error handling.

Summary by CodeRabbit

New Features
- Optional cloud storage setting to use numeric table IDs in file/directory paths (URI/config/API support).
Bug Fixes
- Stronger validation for exchange-partition DDL events to reject missing/invalid source info.
- DDL/schema writes now include explicit table identifiers and skip database-level schema files when table-ID-as-path is enabled.
- Improved error handling around index/data path generation.
Tests
- Added tests for table-ID paths, skipping DB schemas, and invalid exchange-partition events.

pingcap-cla-assistant · 2026-03-05T06:52:08Z

All committers have signed the CLA.

gemini-code-assist · 2026-03-05T06:52:15Z

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the cloud storage sink by introducing a new configuration option, use-table-id-as-path. This option provides users with the flexibility to specify whether the output directory structure for table-related data should utilize the table's unique ID instead of its name, which can be beneficial for data management and consistency. The changes involve updating the sink's file writing logic and path generation to respect this new setting, along with corresponding test coverage. Additionally, a critical fix for DDL event processing ensures that exchange partition operations are handled correctly by validating the presence of source table information.

Highlights

New Configuration Option: Introduced use-table-id-as-path for the cloud storage sink, allowing output paths to use table IDs instead of table names for better organization and uniqueness.
Enhanced Path Generation Logic: Updated internal file path generation functions to conditionally use the table's unique ID based on the new configuration setting.
Improved DDL Event Handling: Added validation for ActionExchangeTablePartition DDL events to ensure that source table information is always present, preventing potential errors.
Expanded Test Coverage: Included new unit tests to verify the functionality of table ID-based paths and the robustness of the DDL event validation.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog

downstreamadapter/sink/cloudstorage/sink.go
- Validated ActionExchangeTablePartition DDL events for missing source table info.
- Modified writeFile function to accept tableID and updated its calls.
downstreamadapter/sink/cloudstorage/sink_test.go
- Added test for use-table-id-as-path option in DDL event writing.
- Included test for invalid ActionExchangeTablePartition DDL events.
pkg/sink/cloudstorage/config.go
- Added UseTableIDAsPath field to urlConfig and Config structs.
- Implemented parsing of use-table-id-as-path from sink URI.
pkg/sink/cloudstorage/config_test.go
- Updated tests to include use-table-id-as-path in URI parsing and config application.
pkg/sink/cloudstorage/path.go
- Introduced generateTablePath to conditionally use table name or ID.
- Updated CheckOrWriteSchema and generateDataDirPath to use the new path generation logic.
pkg/sink/cloudstorage/path_test.go
- Added test for data file path generation using table ID.
pkg/sink/cloudstorage/table_definition.go
- Modified GenerateSchemaFilePath to accept useTableIDAsPath and tableID parameters.
- Updated schema file path generation to use table ID when configured.
pkg/sink/cloudstorage/table_definition_test.go
- Updated tests for GenerateSchemaFilePath to cover the useTableIDAsPath option.

Activity

Implemented a new configuration option for cloud storage sink path generation.
Refactored DDL event handling to include robust validation for exchange partition operations.
Added comprehensive unit tests to ensure the correctness of the new feature and the DDL event fix.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

coderabbitai · 2026-03-05T06:52:17Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 29448a0b-8b0c-4950-9963-71a751107e2a

📥 Commits

Reviewing files that changed from the base of the PR and between 1e51de0 and 095fb1b.

📒 Files selected for processing (1)

pkg/sink/cloudstorage/path.go

📝 Walkthrough

Walkthrough

Adds a use-table-id-as-path option across API, config, and cloudstorage sink; path generation and schema/index/data file APIs now accept tableID and return errors; sink writes schema files using explicit tableIDs (exchange requires source TableInfo); tests updated and new validations added.

Changes

Cohort / File(s)	Summary
Configuration & API `pkg/sink/cloudstorage/config.go`, `pkg/sink/cloudstorage/config_test.go`, `api/v2/model.go`, `api/v2/model_test.go`	Add `UseTableIDAsPath` flag to Config and API types, bind `use-table-id-as-path` URI param, merge/apply value, and add tests for parsing and API round-trip.
Path & Table-definition logic `pkg/sink/cloudstorage/path.go`, `pkg/sink/cloudstorage/path_test.go`, `pkg/sink/cloudstorage/table_definition.go`, `pkg/sink/cloudstorage/table_definition_test.go`	Change path APIs to accept `useTableIDAsPath`/`tableID`, add `generateTablePath`, switch to error returns instead of panics, validate schema/version/tableID, and update tests for table-ID-based paths.
Sink implementation & tests `downstreamadapter/sink/cloudstorage/sink.go`, `downstreamadapter/sink/cloudstorage/sink_test.go`	Require source TableInfo for exchange-partition DDL, write target and source table defs with explicit tableIDs, update `writeFile` signature to accept `tableID`, adjust path-generation calls, and add/extend tests (including invalid exchange cases and table-id-as-path scenarios).
Writer changes `downstreamadapter/sink/cloudstorage/writer.go`	Capture and handle errors from `GenerateIndexFilePath` during flush, log contextual info, and return on failure instead of proceeding with invalid paths.
Config plumbing & CLI `pkg/config/sink.go`, `pkg/config/changefeed.go`	Add `use-table-id-as-path` SinkURI key and `SinkConfig` field, parse and propagate the param, include in compatibility checks, and clear field in rmStorageOnlyFields for non-storage sinks.

Sequence Diagram(s)

sequenceDiagram
    participant CDC as CDC/EventProducer
    participant Sink as CloudStorageSink
    participant Path as PathGenerator
    participant Writer as StorageWriter

    CDC->>Sink: Emit DDL event (includes tableID / exchange info)
    Sink->>Sink: Validate event (ensure exchange source TableInfo present)
    Sink->>Path: GenerateSchemaFilePath(useTableIDAsPath, tableID)
    Path-->>Sink: schemaFilePath or error
    Sink->>Writer: GenerateIndexFilePath / GenerateDataFilePath
    Writer-->>Sink: index/data path or error
    Sink->>Writer: Write schema/index/data files
    Writer-->>CDC: Ack or error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested labels

lgtm, approved

Suggested reviewers

wk989898
hongyunyan

Poem

🐰 I hopped through configs with a tiny cheer,

Swapped names for numbers so paths are clear,
I checked each schema and guarded every test,
Tables now find home by ID, not guessed,
🥕 A little rabbit's tidy file-path fest.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 9.09% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately and concisely describes the main change: adding a new use-table-id-as-path option to cloud storage sink configuration.
Description check	✅ Passed	The PR description addresses the main problem, explains how the feature works with clear examples, lists test coverage, and includes issue reference (close `#4357`).
Linked Issues check	✅ Passed	All key requirements from issue `#4357` are met: configuration switch added, path omits schema when enabled, DB schema writes skipped, and unit tests included.
Out of Scope Changes check	✅ Passed	All changes directly support the use-table-id-as-path feature: configuration parsing, path generation logic, validation for exchange partitions, and related test coverage are all in scope.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces a use-table-id-as-path option for the cloud storage sink, enabling paths to be constructed using table IDs instead of table names. While the implementation is well-tested and appears correct, a security audit identified that the use of log.Panic for handling invalid or unexpected input from external sources (DDL events or storage files) could lead to Denial of Service (DoS). It is recommended to replace panics with proper error handling to ensure the stability of the replication process. Additionally, there are minor refactoring opportunities to improve code reuse and test structure.

downstreamadapter/sink/cloudstorage/sink.go

pkg/sink/cloudstorage/path.go

pkg/sink/cloudstorage/table_definition.go

downstreamadapter/sink/cloudstorage/sink_test.go

pkg/sink/cloudstorage/table_definition.go

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@downstreamadapter/sink/cloudstorage/sink_test.go`:
- Around line 289-293: The test is missing setup of the mock PD clock before
creating the sink; call appcontext.SetService to register the mock PD clock
service (same mock used in other tests) prior to invoking newSinkForTest so the
sink and any PD clock accessors see the mock; specifically, in the test where
ctx, cancel := context.WithCancel(context.Background()) and cloudStorageSink,
err := newSinkForTest(...) are created, insert the mock PD clock registration
(via appcontext.SetService(mockPDClock)) using the same mock instance used by
TestWriteDDLEvent/TestWriteDDLEventWithTableIDAsPath and ensure it is cleaned up
or reset after the test.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: ef1e6151-8c9d-4aad-8422-14f382fcb166

📥 Commits

Reviewing files that changed from the base of the PR and between 039417c and 74c09c3.

📒 Files selected for processing (8)

downstreamadapter/sink/cloudstorage/sink.go
downstreamadapter/sink/cloudstorage/sink_test.go
pkg/sink/cloudstorage/config.go
pkg/sink/cloudstorage/config_test.go
pkg/sink/cloudstorage/path.go
pkg/sink/cloudstorage/path_test.go
pkg/sink/cloudstorage/table_definition.go
pkg/sink/cloudstorage/table_definition_test.go

downstreamadapter/sink/cloudstorage/sink_test.go

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/sink/cloudstorage/table_definition.go`:
- Around line 334-343: The GenerateSchemaFilePath function allows
useTableIDAsPath to produce a "0" path when tableID is zero; add a guard at the
top of GenerateSchemaFilePath to validate tableID (e.g., tableID > 0) when
useTableIDAsPath is true and return a clear error if invalid, before calling
generateTablePath or computing checksum, so callers fail fast instead of writing
into an ambiguous "0" directory; reference: GenerateSchemaFilePath,
useTableIDAsPath, tableID, and generateTablePath.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 402c67b5-c910-49f1-b230-ddc3b6062f96

📥 Commits

Reviewing files that changed from the base of the PR and between 74c09c3 and 9326b6d.

📒 Files selected for processing (2)

downstreamadapter/sink/cloudstorage/sink_test.go
pkg/sink/cloudstorage/table_definition.go

pkg/sink/cloudstorage/table_definition.go

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/sink/cloudstorage/table_definition_test.go`:
- Around line 524-529: The test assertions in table_definition_test.go expect
the old error text "tableID must be positive" but GenerateSchemaFilePath now
returns "invalid table id for table-id path"; update the two assertions that
call GenerateSchemaFilePath (the cases with tableID 0 and -1) to assert that
err.Error() contains "invalid table id for table-id path" instead of the old
string so the test matches the current implementation.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 213ab268-deb6-4709-a28f-0f9b8b979e5f

📥 Commits

Reviewing files that changed from the base of the PR and between 7296adb and 1e5ab99.

📒 Files selected for processing (2)

pkg/sink/cloudstorage/table_definition.go
pkg/sink/cloudstorage/table_definition_test.go

pkg/sink/cloudstorage/table_definition_test.go

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

pkg/sink/cloudstorage/table_definition_test.go (1)
522-529: Add one test that documents intended behavior when tableID is unset.

These assertions currently codify hard-fail on tableID == 0/-1. Please add a companion case that captures the intended production behavior for missing IDs (fallback vs fail-fast), so the contract stays explicit.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/sink/cloudstorage/table_definition_test.go` around lines 522 - 529, Add a
companion unit test for the missing/unset tableID case that calls
def.GenerateSchemaFilePath(false, 0) (i.e., use-table-id-as-path = false with
tableID == 0) and assert the function returns no error and a schema path that
uses the table name/default fallback rather than failing; this makes the
contract explicit alongside the existing failure assertions for
GenerateSchemaFilePath(true, 0) and GenerateSchemaFilePath(true, -1).

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/sink/cloudstorage/table_definition.go`:
- Around line 355-360: The check that hard-fails when tableID <= 0 inside the
useTableIDAsPath branch should be changed to gracefully fall back to using the
table name path instead of returning
errors.ErrInternalCheckFailed.GenWithStackByArgs; update the logic around the
useTableIDAsPath && t.Table branch (referencing useTableIDAsPath, t.Table,
tableID, and the current error return) so that if tableID is unset/<=0 you skip
the error and proceed to generate a path based on t.Table (or the existing
name-based path logic) rather than aborting.

---

Nitpick comments:
In `@pkg/sink/cloudstorage/table_definition_test.go`:
- Around line 522-529: Add a companion unit test for the missing/unset tableID
case that calls def.GenerateSchemaFilePath(false, 0) (i.e., use-table-id-as-path
= false with tableID == 0) and assert the function returns no error and a schema
path that uses the table name/default fallback rather than failing; this makes
the contract explicit alongside the existing failure assertions for
GenerateSchemaFilePath(true, 0) and GenerateSchemaFilePath(true, -1).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77b845f5-35c1-4489-aa13-e29feadc64c4

📥 Commits

Reviewing files that changed from the base of the PR and between 1e5ab99 and 3302635.

📒 Files selected for processing (3)

pkg/sink/cloudstorage/path.go
pkg/sink/cloudstorage/table_definition.go
pkg/sink/cloudstorage/table_definition_test.go

pkg/sink/cloudstorage/table_definition.go

tenfyzhong · 2026-03-05T08:55:35Z

/check-issue-triage-complete

…-path'

ti-chi-bot · 2026-03-06T04:36:57Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign flowbehappy for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS
pkg/config/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

yinshuangfei · 2026-03-06T05:29:19Z

@coderabbitai review

coderabbitai · 2026-03-06T05:29:29Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

yinshuangfei · 2026-03-06T05:33:56Z

@coderabbitai resume

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/sink/cloudstorage/path.go (1)
398-418: ⚠️ Potential issue | 🟠 Major

Avoid duplicating the physical table ID in partition paths.

With both UseTableIDAsPath and EnablePartitionSeparator enabled, Line 400-403 and Line 417 both append tbl.TableNameWithPhysicTableID.TableID. For partitioned tables this generates <physicalTableID>/<version>/<physicalTableID>/..., so the extra segment stops carrying distinct partition information and changes the path layout unexpectedly.
💡 One safe direction if physical table ID is already the top-level key
-	if f.config.EnablePartitionSeparator && tbl.TableNameWithPhysicTableID.IsPartition {
+	if f.config.EnablePartitionSeparator &&
+		tbl.TableNameWithPhysicTableID.IsPartition &&
+		!f.config.UseTableIDAsPath {
 		elems = append(elems, fmt.Sprintf("%d", tbl.TableNameWithPhysicTableID.TableID))
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/sink/cloudstorage/path.go` around lines 398 - 418, The code appends the
physical TableID twice when both f.config.UseTableIDAsPath and
f.config.EnablePartitionSeparator are true; update the partition-separator block
so it only appends the extra TableID when it is not already included by
generateTablePath. Concretely, in the branch that currently appends
fmt.Sprintf("%d", tbl.TableNameWithPhysicTableID.TableID) under
f.config.EnablePartitionSeparator, add a guard to skip this append when
f.config.UseTableIDAsPath is true (or when generateTablePath was called with the
TableID-included flag), referencing f.config.UseTableIDAsPath,
f.config.EnablePartitionSeparator, tbl.TableNameWithPhysicTableID.TableID,
generateTablePath, and f.versionMap[tbl].

🧹 Nitpick comments (1)

pkg/sink/cloudstorage/path.go (1)

218-232: Remove the second invalid-table-ID check.

Line 218 already routes through def.GenerateSchemaFilePath(...), which is the canonical validation point for use-table-id-as-path. The extra branch at Line 230-232 is dead on this path and duplicates the error literal, so the two sites can drift again.

♻️ Suggested cleanup

 	tblSchemaFile, err := def.GenerateSchemaFilePath(f.config.UseTableIDAsPath, table.TableNameWithPhysicTableID.TableID)
 	if err != nil {
 		return false, err
 	}
 	exist, err := f.storage.FileExists(ctx, tblSchemaFile)
 	if err != nil {
 		return false, err
 	}
 	if exist {
 		f.versionMap[table] = table.TableInfoVersion
 		return false, nil
 	}
-	if f.config.UseTableIDAsPath && table.TableNameWithPhysicTableID.TableID <= 0 {
-		return false, errors.ErrInternalCheckFailed.GenWithStackByArgs("invalid table id for table-id path")
-	}

Based on learnings: GenerateSchemaFilePath(useTableIDAsPath bool, tableID int64) intentionally returns ErrInternalCheckFailed when tableID <= 0 and useTableIDAsPath=true.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/sink/cloudstorage/path.go` around lines 218 - 232, The code duplicates
validation for table IDs: GenerateSchemaFilePath(f.config.UseTableIDAsPath,
table.TableNameWithPhysicTableID.TableID) already returns ErrInternalCheckFailed
when UseTableIDAsPath is true and tableID <= 0, so remove the redundant
conditional that checks f.config.UseTableIDAsPath &&
table.TableNameWithPhysicTableID.TableID <= 0 (the branch after exist check that
returns ErrInternalCheckFailed) to avoid dead/duplicated validation; keep the
call to def.GenerateSchemaFilePath, the FileExists check (f.storage.FileExists),
and the versionMap assignment (f.versionMap[table] = table.TableInfoVersion)
untouched.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@downstreamadapter/sink/cloudstorage/sink.go`:
- Around line 227-234: The early-return in sink.writeFile incorrectly uses
!def.IsTableSchema(), causing invalid table-level DDLs to be silently skipped in
UseTableIDAsPath mode; change the condition to explicitly check for a true
database-level DDL by using def.Table == "" (and keep the surrounding
cfg.UseTableIDAsPath check) so only genuine schema-level events are skipped and
invalid table defs will proceed to GenerateSchemaFilePath and surface validation
errors; reference symbols: sink.writeFile, cfg.UseTableIDAsPath, def.Table,
def.IsTableSchema(), GenerateSchemaFilePath.

In `@pkg/config/sink.go`:
- Around line 954-965: The compatibility check must include the new
UseTableIDAsPath flag so sink-URI-only changes are detected: update
CheckCompatibilityWithSinkURI to parse UseTableIDAsPath from params (as done in
the shown block) and compare it against the current s.UseTableIDAsPath value; if
they differ, populate cfgInSinkURI and cfgInFile for UseTableIDAsPath (similar
to Protocol and TxnAtomicity) and do not let the function return early ignoring
this difference. Ensure you reference the same symbols (UseTableIDAsPath,
CheckCompatibilityWithSinkURI, cfgInSinkURI, cfgInFile, params.Get) when adding
this comparison so URI-driven toggles are correctly treated as
compatibility-breaking.

---

Outside diff comments:
In `@pkg/sink/cloudstorage/path.go`:
- Around line 398-418: The code appends the physical TableID twice when both
f.config.UseTableIDAsPath and f.config.EnablePartitionSeparator are true; update
the partition-separator block so it only appends the extra TableID when it is
not already included by generateTablePath. Concretely, in the branch that
currently appends fmt.Sprintf("%d", tbl.TableNameWithPhysicTableID.TableID)
under f.config.EnablePartitionSeparator, add a guard to skip this append when
f.config.UseTableIDAsPath is true (or when generateTablePath was called with the
TableID-included flag), referencing f.config.UseTableIDAsPath,
f.config.EnablePartitionSeparator, tbl.TableNameWithPhysicTableID.TableID,
generateTablePath, and f.versionMap[tbl].

---

Nitpick comments:
In `@pkg/sink/cloudstorage/path.go`:
- Around line 218-232: The code duplicates validation for table IDs:
GenerateSchemaFilePath(f.config.UseTableIDAsPath,
table.TableNameWithPhysicTableID.TableID) already returns ErrInternalCheckFailed
when UseTableIDAsPath is true and tableID <= 0, so remove the redundant
conditional that checks f.config.UseTableIDAsPath &&
table.TableNameWithPhysicTableID.TableID <= 0 (the branch after exist check that
returns ErrInternalCheckFailed) to avoid dead/duplicated validation; keep the
call to def.GenerateSchemaFilePath, the FileExists check (f.storage.FileExists),
and the versionMap assignment (f.versionMap[table] = table.TableInfoVersion)
untouched.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a51876cc-08fb-4742-acf1-cc189eb723ed

📥 Commits

Reviewing files that changed from the base of the PR and between 3302635 and 2cc7aa9.

📒 Files selected for processing (11)

api/v2/model.go
api/v2/model_test.go
downstreamadapter/sink/cloudstorage/sink.go
downstreamadapter/sink/cloudstorage/sink_test.go
pkg/config/changefeed.go
pkg/config/sink.go
pkg/sink/cloudstorage/config.go
pkg/sink/cloudstorage/path.go
pkg/sink/cloudstorage/path_test.go
pkg/sink/cloudstorage/table_definition.go
pkg/sink/cloudstorage/table_definition_test.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/sink/cloudstorage/path_test.go

downstreamadapter/sink/cloudstorage/sink.go

pkg/config/sink.go

coderabbitai · 2026-03-06T05:41:04Z

✅ Actions performed

Reviews resumed.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

pkg/config/sink.go (1)

954-965: ⚠️ Potential issue | 🟠 Major

Add UseTableIDAsPath to the compatibility delta check.

This block makes use-table-id-as-path URI-governed, but CheckCompatibilityWithSinkURI still only keys off Protocol and TxnAtomicity. A changefeed update that only toggles UseTableIDAsPath can therefore return early and miss a real URI/config conflict.

🐛 Proposed fix

-	cfgParamsChanged := s.Protocol != oldSinkConfig.Protocol ||
-		s.TxnAtomicity != oldSinkConfig.TxnAtomicity
+	cfgParamsChanged := s.Protocol != oldSinkConfig.Protocol ||
+		s.TxnAtomicity != oldSinkConfig.TxnAtomicity ||
+		s.UseTableIDAsPath != oldSinkConfig.UseTableIDAsPath

Also applies to: 997-1014

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/config/sink.go` around lines 954 - 965, The change makes UseTableIDAsPath
(s.UseTableIDAsPath) driven from the sink URI but CheckCompatibilityWithSinkURI
still ignores that key, so updates toggling UseTableIDAsPath can bypass
compatibility checks; update CheckCompatibilityWithSinkURI to include
UseTableIDAsPath in the compatibility delta comparison (same style as
Protocol/TxnAtomicity), using the same config key string (UseTableIDAsPathKey)
and comparing util.GetOrZero(s.UseTableIDAsPath) with the parsed/enabled value
from cfgInSinkURI/cfgInFile, and ensure cfgInSinkURI and cfgInFile are populated
consistently where UseTableIDAsPath is parsed so the function can detect and
return conflicts correctly.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/sink/cloudstorage/path.go`:
- Around line 127-131: The generateTablePath function currently returns any
tableID as a string which allows useTableIDAsPath=true to produce "0/..." paths;
change the logic to reject non-positive IDs the same way GenerateSchemaFilePath
does: when useTableIDAsPath is true and tableID <= 0 return an error
(ErrInternalCheckFailed) instead of silently returning "0", and update callers
that build data/index directories (the code paths referenced around the
data/index generation using generateTablePath) to handle/propagate that error so
data/index path construction fails early rather than creating mixed ID/name
paths.

---

Duplicate comments:
In `@pkg/config/sink.go`:
- Around line 954-965: The change makes UseTableIDAsPath (s.UseTableIDAsPath)
driven from the sink URI but CheckCompatibilityWithSinkURI still ignores that
key, so updates toggling UseTableIDAsPath can bypass compatibility checks;
update CheckCompatibilityWithSinkURI to include UseTableIDAsPath in the
compatibility delta comparison (same style as Protocol/TxnAtomicity), using the
same config key string (UseTableIDAsPathKey) and comparing
util.GetOrZero(s.UseTableIDAsPath) with the parsed/enabled value from
cfgInSinkURI/cfgInFile, and ensure cfgInSinkURI and cfgInFile are populated
consistently where UseTableIDAsPath is parsed so the function can detect and
return conflicts correctly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b4a1808c-9d41-4b3a-904b-92ae604bef64

📥 Commits

Reviewing files that changed from the base of the PR and between 3302635 and 2cc7aa9.

📒 Files selected for processing (11)

api/v2/model.go
api/v2/model_test.go
downstreamadapter/sink/cloudstorage/sink.go
downstreamadapter/sink/cloudstorage/sink_test.go
pkg/config/changefeed.go
pkg/config/sink.go
pkg/sink/cloudstorage/config.go
pkg/sink/cloudstorage/path.go
pkg/sink/cloudstorage/path_test.go
pkg/sink/cloudstorage/table_definition.go
pkg/sink/cloudstorage/table_definition_test.go

pkg/sink/cloudstorage/path.go

tenfyzhong · 2026-03-06T09:22:25Z

pkg/sink/cloudstorage/config.go

+	if urlParameter.UseTableIDAsPath != nil {
+		c.UseTableIDAsPath = *urlParameter.UseTableIDAsPath
+	}


Suggested change

if urlParameter.UseTableIDAsPath != nil {

c.UseTableIDAsPath = *urlParameter.UseTableIDAsPath

}

c.UseTableIDAsPath = util.GetOrZero(urlParameter.UseTableIDAsPath)

Use the util.GetOrZero function to keep the code style

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

pkg/sink/cloudstorage/path_test.go (1)

141-166: Add the partitioned-table case here too.

This only locks in the plain-table path shape. The branch most likely to regress is UseTableIDAsPath=true with EnablePartitionSeparator=true and IsPartition=true, where the partition ID must stay omitted so we don't reintroduce tableID/.../<partitionID>/... duplicates.

💡 Suggested test extension

 func TestGenerateDataFilePathWithTableIDAsPath(t *testing.T) {
 	t.Parallel()

 	ctx, cancel := context.WithCancel(context.TODO())
 	defer cancel()
@@
 	date := f.GenerateDateStr()
 	path, err := f.GenerateDataFilePath(ctx, table, date)
 	require.NoError(t, err)
 	require.Equal(t, fmt.Sprintf("12345/5/CDC_%s_000001.json", table.DispatcherID.String()), path)
+
+	partitioned := table
+	partitioned.TableNameWithPhysicTableID.IsPartition = true
+	f = testFilePathGenerator(ctx, t, dir)
+	f.config.UseTableIDAsPath = true
+	f.config.EnablePartitionSeparator = true
+	f.versionMap[partitioned] = partitioned.TableInfoVersion
+
+	path, err = f.GenerateDataFilePath(ctx, partitioned, date)
+	require.NoError(t, err)
+	require.Equal(t, fmt.Sprintf("12345/5/CDC_%s_000001.json", partitioned.DispatcherID.String()), path)
 }

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/sink/cloudstorage/path_test.go` around lines 141 - 166, Extend
TestGenerateDataFilePathWithTableIDAsPath to also cover the partitioned-table
case: create a VersionedTableName with IsPartition=true and a non-empty
PartitionID, set f.config.UseTableIDAsPath=true and
f.config.EnablePartitionSeparator=true, populate f.versionMap for that table
version, call f.GenerateDataFilePath(ctx, table, date) and assert no error and
that the resulting path omits the partition ID (matches
"12345/5/CDC_<dispatcher>_000001.json"). This ensures GenerateDataFilePath
respects UseTableIDAsPath + EnablePartitionSeparator when IsPartition is true.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/sink/cloudstorage/path.go`:
- Around line 405-440: generateDataDirPath currently reads f.versionMap[tbl]
directly which yields a zero value (0) when the map hasn't been primed; change
FilePathGenerator.generateDataDirPath to check for presence using the comma-ok
idiom (e.g., v, ok := f.versionMap[tbl]) and return a descriptive error if !ok
(mentioning tbl or that CheckOrWriteSchema must run) so callers of
GenerateDataFilePath / GenerateIndexFilePath fail fast rather than producing a
"0" path segment; keep the rest of the path-building logic unchanged and
reference CheckOrWriteSchema in the error message as the priming step.

---

Nitpick comments:
In `@pkg/sink/cloudstorage/path_test.go`:
- Around line 141-166: Extend TestGenerateDataFilePathWithTableIDAsPath to also
cover the partitioned-table case: create a VersionedTableName with
IsPartition=true and a non-empty PartitionID, set f.config.UseTableIDAsPath=true
and f.config.EnablePartitionSeparator=true, populate f.versionMap for that table
version, call f.GenerateDataFilePath(ctx, table, date) and assert no error and
that the resulting path omits the partition ID (matches
"12345/5/CDC_<dispatcher>_000001.json"). This ensures GenerateDataFilePath
respects UseTableIDAsPath + EnablePartitionSeparator when IsPartition is true.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 769312cb-ecc2-4a88-8268-0dc98f576431

📥 Commits

Reviewing files that changed from the base of the PR and between 2cc7aa9 and 1e51de0.

📒 Files selected for processing (7)

downstreamadapter/sink/cloudstorage/sink.go
downstreamadapter/sink/cloudstorage/writer.go
pkg/config/sink.go
pkg/sink/cloudstorage/config.go
pkg/sink/cloudstorage/path.go
pkg/sink/cloudstorage/path_test.go
pkg/sink/cloudstorage/table_definition.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/config/sink.go

pkg/sink/cloudstorage/path.go

sink(cloudstorage): add use-table-id-as-path option

74c09c3

ti-chi-bot bot added do-not-merge/needs-linked-issue first-time-contributor Indicates that the PR was contributed by an external member and is a first-time contributor. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Mar 5, 2026

ti-chi-bot bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 5, 2026

gemini-code-assist bot reviewed Mar 5, 2026

View reviewed changes

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

downstreamadapter/sink/cloudstorage/sink_test.go Outdated Show resolved Hide resolved

ti-chi-bot bot added do-not-merge/needs-triage-completed and removed do-not-merge/needs-linked-issue labels Mar 5, 2026

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

pkg/sink/cloudstorage/table_definition.go Outdated Show resolved Hide resolved

update

7296adb

yinshuangfei force-pushed the cdc_path_table_id branch from 9326b6d to 7296adb Compare March 5, 2026 07:42

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

pkg/sink/cloudstorage/table_definition_test.go Outdated Show resolved Hide resolved

update

3302635

yinshuangfei force-pushed the cdc_path_table_id branch from 1e5ab99 to 3302635 Compare March 5, 2026 08:33

ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Mar 5, 2026

coderabbitai bot reviewed Mar 5, 2026

View reviewed changes

pkg/sink/cloudstorage/table_definition.go Outdated Show resolved Hide resolved

ti-chi-bot bot removed the do-not-merge/needs-triage-completed label Mar 5, 2026

yinshuangfei changed the title ~~sink(cloudstorage): add use-table-id-as-path option~~ DNM: sink(cloudstorage): add use-table-id-as-path option Mar 5, 2026

remove database name in cdc path, add query info for 'use-table-id-as…

2cc7aa9

…-path'

yinshuangfei changed the title ~~DNM: sink(cloudstorage): add use-table-id-as-path option~~ sink(cloudstorage): add use-table-id-as-path option Mar 6, 2026

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

downstreamadapter/sink/cloudstorage/sink.go Show resolved Hide resolved

pkg/config/sink.go Show resolved Hide resolved

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

pkg/sink/cloudstorage/path.go Outdated Show resolved Hide resolved

update

115706e

tenfyzhong reviewed Mar 6, 2026

View reviewed changes

Refactor

1e51de0

coderabbitai bot reviewed Mar 6, 2026

View reviewed changes

pkg/sink/cloudstorage/path.go Show resolved Hide resolved

update

095fb1b

Conversation

yinshuangfei commented Mar 5, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What problem does this PR solve?

What is changed and how it works?

Check List

Tests

Questions

Will it cause performance regression or break compatibility?

Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Summary by CodeRabbit

Summary by CodeRabbit

Uh oh!

pingcap-cla-assistant bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gemini-code-assist bot commented Mar 5, 2026

Summary of Changes

Highlights

Footnotes

Uh oh!

coderabbitai bot commented Mar 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

❌ Failed checks (1 warning)

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tenfyzhong commented Mar 5, 2026

Uh oh!

ti-chi-bot bot commented Mar 6, 2026

Uh oh!

yinshuangfei commented Mar 6, 2026

Uh oh!

coderabbitai bot commented Mar 6, 2026

Uh oh!

yinshuangfei commented Mar 6, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot commented Mar 6, 2026

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tenfyzhong Mar 6, 2026

yinshuangfei commented Mar 5, 2026 •

edited by coderabbitai bot

Loading

pingcap-cla-assistant bot commented Mar 5, 2026 •

edited

Loading

coderabbitai bot commented Mar 5, 2026 •

edited

Loading